A MODEL FOR UNCOORDINATED DISTRIBUTED COMPUTATION OF FIXED POINTS*

DIMITRI  P.  BERTSEKAS 

Laboratory  for  Information  and  Decision  Systems 

Department  of  Electrical  Engineering  and  Computer  Science 

Massachusetts  Institute  of  Technology 

Cambridge,  Massachusetts  02139 

ABSTRACT 

We present an algorithmic model for distributed computation of fixed
points whereby several processors participate simultaneously in the
calculations while exchanging information via communication links. We place
essentially no assumptions on the ordering of computation and communication
between processors, thereby allowing for completely uncoordinated execution.
We find that even under these potentially chaotic circumstances it is
possible to solve several important classes of problems, including the
calculation of fixed points of contraction and monotone mappings arising in
linear and nonlinear systems of equations, shortest path problems, and
dynamic programming.

1. INTRODUCTION

There is presently a great deal of interest in distributed
implementations of various iterative algorithms whereby the computational
load is shared by several processors while coordination is maintained by
information exchange via communication links. In most of the work done in
this area the starting point is some iterative algorithm which is guaranteed
to converge to the correct solution under the usual circumstances of
centralized computation in a single processor. The computational load of the
typical iteration is then divided in some way between the available
processors, and it is assumed that the processors exchange all necessary
information regarding the outcomes of the current iteration before a new
iteration can begin.

The mode of operation described above may be termed synchronous in the
sense that each processor must complete its assigned portion of an
iteration and communicate the results to every other processor before a new
iteration can begin. This assumption certainly enhances the orderly
operation of the algorithm and greatly simplifies the convergence analysis.
On the other hand, synchronous distributed algorithms also have some obvious
implementation disadvantages, such as the need for an algorithm initiation
and iteration synchronization protocol. Furthermore, the speed of
computation is limited to that of the slowest processor. It is thus
interesting to consider algorithms that can tolerate a more flexible
ordering of computation and communication between processors. Such
algorithms have so far found applications in computer communication networks
like the ARPANET [1], where processor failures are common and it is quite
complicated to maintain synchronization between the nodes of the entire
network as they execute real-time network functions such as the routing
algorithm.

Processor network environments for which weakly coordinated distributed
computation seems particularly advantageous typically possess one or more
of the following characteristics, all of which involve the occurrence of
some type of unpredictable event.

1) Computation nodes and communication links are subject to frequent and/or
unexpected failures (for example, in packet radio networks).

2) Computation nodes have different and/or time-varying speeds of
execution (for example, each processor is assigned to a perhaps time-varying
number of tasks involving computation loads which are not fixed a priori).

3) Computation at various nodes is event driven (for example, in data
collection or sensor networks where the timing and ordering of
measurements may not be predictable).

It  is  possible  to  consider  various  degrees  of  coordination  in  different 
types  of  distributed  algorithms.  An  interesting  question  is  to  determine 
the  minimum  degree  of  coordination  needed  in  a  given  algorithm  in  order  to 
obtain  the  correct  solution.  To  this  end  we  consider  an  extreme  model  of 
uncoordinated  distributed  algorithms  whereby  computation  and  communication 
are  performed  at  each  processor  completely  independently  of  the  progress  in 
other  processors.  It  is  perhaps  surprising  that  even  under  these  chaotic 
circumstances  it  is  still  possible  to  solve  correctly  important  classes  of 
fixed  point  problems.  The  complete  analysis  is  given  in  Ref.  [2]  for  broad 
classes  of  dynamic  programming  and  in  Ref.  [3]  for  more  general  fixed  point 
problems  involving  contraction  and  monotonicity  assumptions.  In  the  present 
paper  we  describe  the  algorithmic  model  and  indicate  the  type  of  results 
that  can  be  shown  in  some  generality. 

2.  A  MODEL  FOR  DISTRIBUTED  UNCOORDINATED  FIXED  POINT  ALGORITHMS 

The fixed point problem considered in this paper is defined in terms
of a set X, a class F of functions mapping X into the extended real line
$[-\infty, +\infty]$, and a mapping T which maps F into itself. We wish to
find an element $J^*$ of F such that

    $J^* = T(J^*)$,    (1)

or equivalently

    $J^*(x) = T(J^*)(x)$, $\forall\, x \in X$,    (2)

where $J^*(x)$ and $T(J^*)(x)$ denote the values of the functions $J^*$ and
$T(J^*)$ respectively at the typical element $x \in X$. We will assume
throughout that T has a unique fixed point $J^*$ within the set F.

We  provide  some  examples: 

Example 1: (Fixed points of mappings on $R^n$). Let X be the finite set

    $X = \{1, 2, \ldots, n\}$,

and F be the set of all real-valued functions on X. Then F can be
identified with the n-dimensional space $R^n$ in the sense that with each
$J \in F$ we can associate the n-dimensional vector
$(J(1), J(2), \ldots, J(n))$. Similarly T(J) can be identified with the
n-dimensional vector $(T(J)(1), \ldots, T(J)(n))$, so the fixed point
problem (1) amounts to solving the system of n equations

    $J^* = T(J^*)$ or $J^*(i) = T(J^*)(i)$, $\forall\, i = 1, \ldots, n$    (3)

with the n unknowns $J^*(1), \ldots, J^*(n)$. It is also evident that any
system of (possibly nonlinear) equations with n unknowns can be formulated
into a fixed point problem such as (3).
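
As an illustration of Example 1, the following minimal Python sketch solves
a small system of the form (3) by repeatedly applying T on a single
processor. The affine mapping T(J) = AJ + b, the entries of A and b, and the
helper name fixed_point are assumptions chosen only for illustration and are
not taken from the references. A has maximum absolute row sum 0.5, so this T
is a contraction with respect to the maximum norm.

```python
# Hypothetical data: A has max absolute row sum 0.5, so T is a max-norm contraction.
A = [[0.2, 0.1, 0.0],
     [0.0, 0.3, 0.2],
     [0.1, 0.0, 0.4]]
b = [1.0, 2.0, 3.0]


def T(J):
    """Return the vector (T(J)(1), ..., T(J)(n)) for T(J) = A J + b."""
    n = len(b)
    return [sum(A[i][j] * J[j] for j in range(n)) + b[i] for i in range(n)]


def fixed_point(T, J0, tol=1e-12, max_iter=10_000):
    """Iterate J <- T(J) until successive iterates differ by at most tol."""
    J = list(J0)
    for _ in range(max_iter):
        J_next = T(J)
        if max(abs(u - v) for u, v in zip(J_next, J)) <= tol:
            return J_next
        J = J_next
    raise RuntimeError("no convergence within max_iter iterations")


J_star = fixed_point(T, [0.0, 0.0, 0.0])
# Residual of the system J* = T(J*), cf. (3).
residual = max(abs(u - v) for u, v in zip(T(J_star), J_star))
print(J_star, residual)
```

Starting the iteration from a different initial vector should give the same
limit up to the tolerance, since the fixed point is assumed to be unique.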

Example 2: (Shortest path problems). Let $(\mathcal{N}, L)$ be a directed
graph where $\mathcal{N} = \{1, 2, \ldots, n\}$ denotes the set of nodes and
L denotes the set of links. Let N(i) denote the downstream neighbors of
node i, i.e., the set of nodes j for which (i,j) is a link. Assume that each
link (i,j) is assigned a positive scalar $a_{ij}$ referred to as its length.
Assume also that there is a directed path to node 1 from every other node.
Then it is known ([2], p. 67) that the shortest path distances $J^*(i)$ to
node 1 from all other nodes i solve uniquely the equations

    $J^*(i) = \min_{j \in N(i)} \{a_{ij} + J^*(j)\}$, $i \neq 1$    (4a)

    $J^*(1) = 0$.    (4b)

If we make the identifications $X = \{1, 2, \ldots, n\}$, F: set of all
functions mapping X into $[0, +\infty]$, and define T(J) for all $J \in F$
by means of

    $T(J)(i) = \begin{cases} \min_{j \in N(i)} \{a_{ij} + J(j)\} & \text{if } i \neq 1 \\ 0 & \text{if } i = 1 \end{cases}$    (5)

then we find that the fixed point problem (2) reduces to the shortest path
problem.
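
The mapping of Example 2 can be sketched in Python as follows. The four-node
graph, its link lengths, and the dictionary name arcs are hypothetical and
serve only to make the example concrete. The sketch applies T repeatedly on
a single processor, starting from the function that is identically
$+\infty$, until the iterate no longer changes.

```python
import math

# Hypothetical graph: arcs[i] maps each downstream neighbor j in N(i)
# to the positive length a_ij of link (i, j). Node 1 is the destination.
arcs = {
    1: {},
    2: {1: 4.0, 3: 1.0},
    3: {1: 2.0},
    4: {2: 1.0, 3: 5.0},
}


def T(J):
    """Apply the shortest path mapping: min over N(i) of a_ij + J(j), and 0 at node 1."""
    out = {}
    for i, neighbors in arcs.items():
        if i == 1:
            out[i] = 0.0
        else:
            out[i] = min((a + J[j] for j, a in neighbors.items()), default=math.inf)
    return out


J = {i: math.inf for i in arcs}
for _ in range(100):
    J_next = T(J)
    if J_next == J:  # J = T(J): a fixed point has been reached
        break
    J = J_next

print(J)  # {1: 0.0, 2: 3.0, 3: 2.0, 4: 4.0}
```

For this graph the shortest paths are 2 -> 3 -> 1 (length 3), 3 -> 1
(length 2), and 4 -> 2 -> 3 -> 1 (length 4), in agreement with the computed
fixed point.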

The  shortest  path  problem  above  is  representative  of  a  broad  class  of 
dynamic  programming  problems  which  can  be  viewed  as  special  cases  of  the 
fixed  point  problem  (2)  and  can  be  correctly  solved  by  using  the  distributed 
algorithms  of  this  paper  (see  [3]). 

Our algorithmic model can be described in terms of a collection of n
computation centers (or processors) referred to as nodes and denoted
1, 2, ..., n. The set X is partitioned into n disjoint sets denoted
$X_1, \ldots, X_n$, i.e.

    $X = \bigcup_{i=1}^{n} X_i$, $X_i \cap X_j = \emptyset$ if $i \neq j$.

Each node i is assigned the responsibility of computing the values of the
solution function $J^*$ [cf. (1), (2)] at all $x \in X_i$.

At each time instant, node i can be in one of three possible states:
compute, transmit, or idle. In the compute state node i computes a new
estimate of the values of the solution function $J^*$ for all $x \in X_i$.
In the transmit state node i communicates the estimate obtained from the
latest computation to one or more nodes j ($j \neq i$). In the idle state
node i does nothing related to the solution of the problem. It is assumed
that a node can receive a transmission from other nodes simultaneously with
computing or transmitting, but this is not a real restriction since, if
needed, a time period in a separate receive state can be lumped into a time
period in the idle state.


We assume that computation and transmission for each node take place
in uninterrupted time intervals $[t_1, t_2]$ with $t_1 < t_2$, but do not
exclude the possibility that a node may be simultaneously transmitting to
more than one node, nor do we assume that the transmission intervals to
these nodes have the same origin and/or termination. We also make no
assumptions on the length, timing and sequencing of computation and
transmission intervals other than the following:

Assumption (A): There exists a positive scalar P such that, for every node
i, every time interval of length P contains at least one computation
interval for i and at least one transmission interval from i to each node
$j \neq i$.


Each node i also has a buffer $B_{ij}$ for each $j \neq i$ where it stores
the latest transmission from j, as well as a buffer $B_{ii}$ where it stores
its own estimate of values of the solution function for all $x \in X_i$.
The contents of each buffer $B_{ij}$ at time t are denoted $J_{ij}^t$. Thus
$J_{ij}^t$ is, for every t, a function from $X_j$ into $[-\infty, +\infty]$
and may be viewed as the estimate by node i of the restriction of the
solution function $J^*$ on $X_j$ available at time t. The rules according to
which the functions $J_{ij}^t$ are updated are as follows:

1) If $[t_1, t_2]$ is a transmission interval from node j to node i, the
contents $J_{jj}^{t_1}$ of the buffer $B_{jj}$ at time $t_1$ are transmitted
and entered in the buffer $B_{ij}$ at time $t_2$, i.e.

    $J_{ij}^{t_2} = J_{jj}^{t_1}$.    (6)


2) If $[t_1, t_2]$ is a computation interval for node i, the contents of
buffer $B_{ii}$ at time $t_2$ are replaced by the restriction of the
function $T(J_i^{t_1})$ on $X_i$ where, for all t, $J_i^t$ is defined by

    $J_i^t(x) = \begin{cases} J_{ii}^t(x) & \text{if } x \in X_i \\ J_{ij}^t(x) & \text{if } x \in X_j,\ j \neq i \end{cases}$    (7)

In other words we have

    $J_{ii}^{t_2}(x) = T(J_i^{t_1})(x)$, $\forall\, x \in X_i$.    (8)


3) The contents of a buffer $B_{ii}$ can change only at the end of a
computation interval for node i. The contents of a buffer $B_{ij}$,
$j \neq i$, can change only at the end of a transmission interval from j
to i.
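
The update rules 1) to 3) can be illustrated by a small discrete-time
simulation, given here as a sketch under several simplifying assumptions
that are not part of the model above: each node owns a single point (so
$X_i = \{i\}$), time is integer valued, each node cycles through one
computation, one transmission to every other node, and one idle period, and
every interval has a random length between 1 and max_len. This schedule
satisfies Assumption (A) while leaving the relative timing of the nodes
uncoordinated. The function names simulate and T_i, the affine contraction
used for T, and the initial buffer contents are all hypothetical.

```python
import heapq
import random

# Hypothetical max-norm contraction T(J) = A J + b (row sums of |A| are at most 0.5).
A = [[0.2, 0.1, 0.0],
     [0.0, 0.3, 0.2],
     [0.1, 0.0, 0.4]]
b = [1.0, 2.0, 3.0]


def T_i(i, J):
    """Return the single component T(J)(i)."""
    return sum(A[i][k] * J[k] for k in range(len(J))) + b[i]


def simulate(T_i, n, init, horizon=4000, max_len=5, seed=0):
    """Run rules 1)-3) with random interval lengths; init[i][j] fills buffer B_ij."""
    rng = random.Random(seed)
    B = [list(row) for row in init]  # B[i][j]: node i's estimate of J*(j)
    # Node i repeats: compute, transmit to each j != i, idle.
    plans = [["compute"] + [j for j in range(n) if j != i] + ["idle"]
             for i in range(n)]
    step = [0] * n       # position of each node in its own cycle
    free_at = [0] * n    # end time t2 of the interval each node is currently in
    pending = []         # heap of (t2, seq, i, j, value): B[i][j] becomes value at t2
    seq = 0
    for t in range(horizon):
        # Rule 3: buffers change only at the end of an interval.
        while pending and pending[0][0] <= t:
            _, _, i, j, value = heapq.heappop(pending)
            B[i][j] = value
        for i in range(n):
            if free_at[i] > t:
                continue  # node i is still inside its current interval
            action = plans[i][step[i] % len(plans[i])]
            step[i] += 1
            t2 = t + rng.randint(1, max_len)
            free_at[i] = t2
            if action == "compute":
                # Rule 2: B_ii at t2 is T applied to node i's buffers as of t1 = t.
                heapq.heappush(pending, (t2, seq, i, i, T_i(i, list(B[i]))))
            elif action != "idle":
                # Rule 1: B_ii as of t1 = t is entered in buffer B_ji of node j at t2.
                heapq.heappush(pending, (t2, seq, action, i, B[i][i]))
            seq += 1
    return B


# The initial buffer contents deliberately differ from node to node.
init = [[0.0, 50.0, -20.0], [10.0, -5.0, 7.0], [-100.0, 3.0, 42.0]]
B = simulate(T_i, 3, init)
J = [B[i][i] for i in range(3)]
print(J)
```

In this sketch the diagonal entries B[i][i] play the role of $J_{ii}^t$ and
the off-diagonal entries B[i][j] the role of $J_{ij}^t$. Changing seed or
max_len alters the schedule but, for a contraction of this kind, should not
alter the limit.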

Additional conditions under which there holds

    $\lim_{t \to \infty} J_i^t(x) = J^*(x)$, $\forall\, x \in X_i$, $i = 1, \ldots, n$    (9)


may be found in [2], [3]. An interesting aspect of results of this type is
that they do not require that the initial processor buffer contents be
identical, and indeed these initial conditions can vary within a broad
range. This means that for problems that are being solved continuously in
real time it is not necessary to reset the initial conditions and
resynchronize the algorithm each time the problem data changes. As a result
the potential for tracking slow variations in the solution function is
improved and algorithm implementation is considerably simplified.